This article is about a technique that comes in quite handy when one wants to build multiple time series models for series that have an inherently hierarchical structure.
Time series can often be naturally disaggregated in a hierarchical structure using attributes such as geographical location, product type, etc. For example, the total number of Members of Parliament (MPs) in a given election can be broken down by State and, within a particular State, by city, district and so forth. Such disaggregation imposes a hierarchical structure. We refer to these as hierarchical time series.
Another possibility is that series can be naturally grouped together based on attributes without necessarily imposing a hierarchical structure. For example, the MPs in the above context can also be split on the basis of sex, viz. male, female and others. Grouped time series can be thought of as hierarchical time series that do not impose a unique hierarchical structure, in the sense that the order in which the series can be grouped is not unique.
I will try to focus more on the intuition than on the mathematical details here, but if one is interested in the latter please refer here
Let’s plot some hierarchical time series below. This one pertains to total quarterly visitor nights from 1998 to 2011 for eight regions of Australia:
Below are some graphs and tables pertaining to the data at hand, to give a better feel for it.
| Time Series Title | Description |
|---|---|
| Sydney | The Sydney metropolitan area. |
| NSW | New South Wales other than Sydney. |
| Melbourne | The Melbourne metropolitan area. |
| VIC | Victoria other than Melbourne. |
| BrisbaneGC | The Brisbane and Gold Coast area. |
| QLD | Queensland other than Brisbane and the Gold Coast. |
| Capitals | The other five capital cities: Adelaide, Hobart, Perth, Darwin and Canberra. |
| Other | All other areas of Australia. |
| Time | Sydney | NSW | Melbourne | VIC | BrisbaneGC | QLD | Capitals | Other |
|---|---|---|---|---|---|---|---|---|
| 1998 | 7320 | 21782 | 4865 | 14054 | 9055 | 8016 | 9178 | 10232 |
| 1998.25 | 6117 | 16881 | 4100 | 8237 | 5616 | 8461 | 6362 | 9540 |
| 1998.5 | 6282 | 13495 | 4418 | 6731 | 8298 | 13175 | 7965 | 12385 |
| 1998.75 | 6368 | 15963 | 5157 | 7675 | 6674 | 9092 | 6864 | 13098 |
| 1999 | 6602 | 22718 | 5550 | 13581 | 9168 | 10224 | 8908 | 10140 |
| 1999.25 | 5651 | 14775 | 3902 | 7883 | 7351 | 9672 | 7690 | 9948 |
As promised, I will try to keep the focus on the intuitive part rather than the theory, but more advanced or curious folks can refer here
The table below summarises the three broad methods that can be used to forecast hierarchical and grouped time series.
Summary Forecast Methods
| Forecast Method | Description |
|---|---|
| Bottom-up approach | This approach involves first generating independent base forecasts for each series at the bottom level of the hierarchy and then aggregating these upwards. |
| Top-down approach | Top-down approaches involve first generating base forecasts for the “Total” series at the top of the hierarchy and then disaggregating these downwards. |
| Middle-out approach | A hybrid of the above two approaches. |
I will briefly try to explain the above methods in our context in layman's terms.
Bottom-up approach: here this means we forecast the lowest level of the hierarchy, i.e. the eight regions listed above, and then aggregate the results up the hierarchy.
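To make the bottom-up idea concrete, here is a minimal base-R sketch that does not use the hts package. It takes the six quarters shown in the data table, produces a deliberately simple stand-in base forecast for each bottom-level series (seasonal naive, which is my assumption and not a method prescribed in this article), and then sums those forecasts up the hierarchy. The parent labels such as `NSW_state` are hypothetical names chosen for illustration, and the pairing of consecutive columns into four groups is assumed from the hierarchy used later in the article.

```r
# Bottom-level history: the six quarters shown in the data table
bottom <- matrix(
  c(7320, 21782, 4865, 14054, 9055,  8016, 9178, 10232,
    6117, 16881, 4100,  8237, 5616,  8461, 6362,  9540,
    6282, 13495, 4418,  6731, 8298, 13175, 7965, 12385,
    6368, 15963, 5157,  7675, 6674,  9092, 6864, 13098,
    6602, 22718, 5550, 13581, 9168, 10224, 8908, 10140,
    5651, 14775, 3902,  7883, 7351,  9672, 7690,  9948),
  ncol = 8, byrow = TRUE,
  dimnames = list(NULL, c("Sydney", "NSW", "Melbourne", "VIC",
                          "BrisbaneGC", "QLD", "Capitals", "Other"))
)

h <- 2             # forecast horizon: two quarters ahead
n <- nrow(bottom)  # number of observed quarters

# Stand-in base forecasts (an assumption for illustration): seasonal naive,
# i.e. repeat the value observed in the same quarter one year earlier
base_bottom <- bottom[n - 4 + seq_len(h), , drop = FALSE]

# Bottom-up step: aggregate the bottom-level forecasts up the hierarchy.
# Hypothetical parent labels; consecutive pairs of columns share a parent.
parent <- rep(c("NSW_state", "VIC_state", "QLD_state", "Other_group"), each = 2)
fc_middle <- t(rowsum(t(base_bottom), group = parent, reorder = FALSE))
fc_total  <- rowSums(base_bottom)

print(fc_middle)
print(fc_total)
```

Because the upper levels are obtained purely by summation, the forecasts are coherent by construction: the middle-level forecasts always add up to the total.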
Now, implementing this is even simpler. I will demonstrate a working example using one of the above techniques; for the rest, you can catch up via the documentation of the hts package in R.
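```r
library(hts)  # provides hts() and the forecast method used below
y <- hts(vn, nodes = list(4, c(2, 2, 2, 2)))  # vn: the eight bottom-level series
```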
The above command creates a hierarchical time series with 3 levels (the topmost level does not have to be specified), with 4 nodes in the middle level and 8 nodes in the bottom level. The argument ‘nodes’ does the trick here: 4 is the number of middle-level nodes, and c(2,2,2,2) gives the number of bottom-level children under each of them.
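As a rough illustration of what that node specification encodes, the base-R sketch below builds the corresponding summing matrix by hand. It does not call hts at all. The names `children`, `parent_of` and `S` are my own, and the input vector is the first row of the data table. Multiplying the matrix by the bottom-level values yields every series in the hierarchy: the total, the four middle-level nodes and the eight bottom-level series.

```r
# Same node specification as in the hts() call: 4 middle nodes, 2 children each
nodes <- list(4, c(2, 2, 2, 2))
children  <- nodes[[2]]
n_bottom  <- sum(children)                               # 8 bottom-level series
parent_of <- rep(seq_along(children), times = children)  # 1 1 2 2 3 3 4 4

# Summing matrix: one row per series in the hierarchy, one column per bottom series
S <- rbind(
  rep(1, n_bottom),                                 # top level: sum of everything
  outer(seq_along(children), parent_of, "==") * 1,  # middle level: sum of own children
  diag(n_bottom)                                    # bottom level: the series themselves
)

# First quarter of the data table (1998), in column order
first_quarter <- c(7320, 21782, 4865, 14054, 9055, 8016, 9178, 10232)

# All 1 + 4 + 8 = 13 series for that quarter
all_series <- as.vector(S %*% first_quarter)
print(dim(S))
print(all_series[1:5])  # total, then the four middle-level aggregates
```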
```r
# method = "tdfp" means top-down forecast proportions; fmethod = "ets" fits ETS models
allf <- forecast(y, h = 8, method = "tdfp", fmethod = "ets")
plot(allf)
```
The above command will give you forecasts across all levels of the hierarchy using the top-down forecast proportions approach mentioned above.
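For intuition only, here is a hand-rolled base-R sketch of the idea behind top-down forecast proportions for a single step ahead. It is not how hts implements the method internally. Base forecasts are made independently at every level, the top-level forecast is kept, and it is split downwards in proportion to the base forecasts of the children at each node. The three base forecast rules below (mean of the last four quarters, seasonal naive, last value) are stand-ins I picked so that the levels disagree with each other; they are assumptions, not ETS and not part of the original example.

```r
# The six quarters shown in the data table (bottom-level series)
bottom <- matrix(
  c(7320, 21782, 4865, 14054, 9055,  8016, 9178, 10232,
    6117, 16881, 4100,  8237, 5616,  8461, 6362,  9540,
    6282, 13495, 4418,  6731, 8298, 13175, 7965, 12385,
    6368, 15963, 5157,  7675, 6674,  9092, 6864, 13098,
    6602, 22718, 5550, 13581, 9168, 10224, 8908, 10140,
    5651, 14775, 3902,  7883, 7351,  9672, 7690,  9948),
  ncol = 8, byrow = TRUE,
  dimnames = list(NULL, c("Sydney", "NSW", "Melbourne", "VIC",
                          "BrisbaneGC", "QLD", "Capitals", "Other"))
)
parent <- rep(1:4, each = 2)                    # middle-level parent of each column
middle <- t(rowsum(t(bottom), group = parent))  # observed middle-level series
total  <- rowSums(bottom)                       # observed top-level series
n <- nrow(bottom)

# Independent one-step-ahead base forecasts (stand-in rules, assumptions only)
base_total  <- mean(total[(n - 3):n])  # top: mean of the last four quarters
base_middle <- middle[n - 3, ]         # middle: same quarter one year earlier
base_bottom <- bottom[n, ]             # bottom: last observed value

# Top-down step: keep the top forecast and split it by forecast proportions
fc_total  <- base_total
fc_middle <- fc_total * base_middle / sum(base_middle)
sibling_sum <- ave(base_bottom, parent, FUN = sum)  # base forecast sum per parent
fc_bottom <- unname(fc_middle[parent]) * base_bottom / sibling_sum

print(round(fc_middle))
print(round(fc_bottom))
```

The revised forecasts add up at every level, which the independent base forecasts generally do not.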
Just to reiterate:
One can dig deeper into the documentation here and try out different techniques with different parameter combinations.